This report describes the design and evaluation of classifiers for
distinguishing four types of modems: the CODEX LSI-9600, the HUGHES HC-276,
the PARADYNE LSI-96, and the LENKURT 26-C. The data used to develop these
classifiers consisted of many digitized time sample waveforms for each modem
and was collected by RADC's Digital Communications Experimental Facility
(DICEF). Using the capabilities of the Waveform Processing System (WPS), the
Interactive Processing Section of the Information Sciences Division (ISCP)
analyzed this waveform data and extracted an initial set of eighty features.
This initial set was later modified to fifty features. The On-Line Pattern
Analysis and Recognition System (OLPARS) was then used to develop a number
of classifier designs which are based on different subsets of the fifty
features.

I.  INTRODUCTION


The ultimate objective of this project is to determine the potential
use of modem signal signatures as in-service indicators of transmission
channel degradation and possibly as diagnostic aids. Toward this end,
digitized time samples of modem signals impaired by channel
perturbations were recorded on magnetic tape in RADC's DICEF (Digital
Communications Experimental Facility). This objective was broken up
into two parts:

a.  Modem  Identification. 

b.  Measurement  of  the  Channel  Perturbations. 

The Waveform Processing System (WPS) and the On-Line Pattern
Analysis and Recognition System (OLPARS) were used to identify
algorithms for the modem identification. Measurement of the channel
perturbations is not covered in this report.

Specifically,  the  following  sequence  of  tasks  was  employed: 

a.  Data Collection

b.  Data Analysis

c.  Feature Hypothesis

d.  Feature Extraction

e.  Feature Evaluation

f.  Classification Logic Design

g.  Testing classification logic designs with independent Test
Data.

The classifier designs discussed in this report are based entirely
on the data collected by RADC's DICEF.

II.  BACKGROUND ON FACILITIES USED


A.  The  Waveform  Processing  System  (WPS). 

The Waveform Processing System is an interactive, graphics-oriented
computer system for the extraction of features from digitized waveform
data and the analysis of a digitized waveform data base. Its chief
purpose is to provide the analyst with a library of mathematical
algorithms and display options he can call upon from the display console
so that he can design and evaluate feature extraction techniques for
waveform pattern recognition problems. Once a set of features has been
extracted from each of the members of a waveform data base, the analyst
can input them into OLPARS to begin the pattern classification logic
design phase of the problem solution.

The Waveform Processing System is implemented on a DEC PDP-11/45
computer with a Vector General display and control console, and a
Tektronix 4002 storage tube with a hardcopy unit for hardcopying
selected Vector General displays.

The  system  includes  its  own  executive  software,  filing  system, 
display  package,  and  a library  of  application  programs.  A feature 
extraction  language  allows  the  analyst  to  construct  his  own  algorithms 
for  waveform  processing  and  feature  extraction. 

The input to WPS is in the form of digitized waveform data. The
system is built as a series of overlays which the operator can call
from a menu displayed on the Vector General CRT. The data, in the form
of data trees, is available to the analyst through the interactive
devices on the Vector General console.


B.  The On-Line Pattern Analysis and Recognition System (OLPARS).


OLPARS is an interactive, graphics-oriented computer system for the
solution of pattern analysis and pattern classification problems.


OLPARS is resident on two systems. One version is on the PDP-11/45
computer under WPS. This is a single-user system employing
high-performance interactive graphics and, as a module under WPS, it
provides for ease of interaction between the feature hypothesis mode
conducted under WPS and rapid testing of these hypotheses under OLPARS.
However, since this system resides on a minicomputer, there are core
limitations on the size of the data base which can be processed.

A second version of OLPARS is implemented on the HIS 6180 computer
under the MULTICS operating system.

Both  versions  of  OLPARS  include  their  own  executive  software,  filing 
system,  display  package,  and  software  modules  for  feature  evaluation, 
vector  data  structure  analysis,  measurement  transformation,  and 
classifier  logic  design. 


C.  Digital Communications Experimental Facility (DICEF).


The Digital Communications Experimental Facility (DICEF) is a unique
laboratory dedicated to data acquisition and analysis, research, and
development in support of digital communications. This facility
provides a combination of programmed data reduction capabilities and a
variety of in-place real and simulated channels to allow a wide choice
of equipment and media experiments. Media simulators provide
controlled, repeatable channel conditions essential to conducting valid
comparative analysis of communications equipment. Units evaluated in
this mode are subjected to numerous combinations of known perturbations
which can be controlled in a completely deterministic manner.

Regardless of whether real or synthetic media are used, correlation of
error performance to channel characteristics can be obtained. This
capability is provided by the heart of the facility, a high-speed
communications processor, the 9303 Message Switch, which operates at any
data rate up to ten megabits per second. The communications processor
possesses the critical attribute of high-speed I/O so that all
information regarding high data rate channels can be acquired and
manipulated in real time.


III.  DATA  COLLECTION 


Several digitized time samples of modem signals impaired by
telephone channel perturbations were recorded on magnetic tape in RADC's
DICEF (Digital Communications Experimental Facility). A block diagram
of the test configuration is shown in Figure 1. A 2047-bit maximal
length pseudorandom digital sequence generator was used as the data
source. The unit used was II Corporation's »pr 901 with an RS-232
output compatible with all modems tested. The analog output of the
modem was then applied to DICEF's Wireline Channel Simulator. The two
technical manuals describing the operation of this simulator are listed
as References 4 and 5.


All frequency components beyond half the sampling rate were removed
by the anti-aliasing low-pass filter portion of the Spectral Dynamics
SD-360 Digital Signal Processor. The analog signal was then converted
to digital form using a 12-bit converter with a 12.8 kHz sampling rate.
The digital samples were recorded by DICEF's 9303 processor on its
7-track magnetic tapes. These tapes were subsequently converted to
9-track WPS-compatible tapes by RADC/ISCP.

Following  is  a short  technical  description  of  the  four  modems 
employed: 

a.  Codex LSI-9600, Double Sideband Suppressed Carrier
Quadrature Amplitude Modulation, 2400 baud, 4-bit samples at 9600 B/S,
1706 Hz carrier.

b.  Hughes HC-276 (MD-823), Differential Phase Shift Keying,
4-phase keying of 1800 Hz carrier, 2400 B/S.

c.  Paradyne LSI-96, Full Response, Pulse Amplitude Modulation
transmitted as a Vestigial Sideband Line Signal, -16 dB carrier at 2851
Hz added in quadrature, 4-level 4800 baud at 9600 B/S.

d.  Lenkurt 26-C, Synchronous Frequency Shift Keying, data
transitions occur at carrier zero-crossing, 2 carrier frequencies (1200
and 2400 Hz), 1200 B/S.

Originally,  the  total  data  consisted  of  52  waveforms  from  the  Codex 
LSI-9600  Modem,  30  waveforms  from  the  Hughes  HC-276  Modem,  35  waveforms 
from  the  Paradyne  LSI-96  Modem,  and  34  waveforms  from  the  Lenkurt  26-C 
Modem. The waveforms were slightly more than 19,800 points long
(approximately 1 1/2 sec at a 12.8 kHz sampling rate).

In order to have more sample waveforms, each waveform was segmented
into three equal parts of 6144 points (approximately 1/2 sec at a 12.8
kHz sampling rate), thus providing three times as many sample waveforms
per modem.
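
The segmentation arithmetic can be sketched as follows. This is a minimal illustration, not the program used in the study: the function name `segment_waveform` and the synthetic record are assumptions, and only the sampling rate, segment length, and waveform counts come from the text.

```python
SAMPLE_RATE_HZ = 12_800   # sampling rate stated in the report
SEGMENT_LEN = 6144        # points per segment stated in the report


def segment_waveform(samples, segment_len=SEGMENT_LEN, n_segments=3):
    """Split a record into n_segments consecutive, equal-length pieces.

    Points left over after the last full segment are discarded.
    """
    if len(samples) < segment_len * n_segments:
        raise ValueError("record too short for the requested segmentation")
    return [samples[k * segment_len:(k + 1) * segment_len]
            for k in range(n_segments)]


# Synthetic stand-in for one recorded waveform of roughly 19,800 points.
record = list(range(19_800))
segments = segment_waveform(record)

segment_seconds = SEGMENT_LEN / SAMPLE_RATE_HZ   # duration of one segment
total_waveforms = 3 * (52 + 30 + 35 + 34)        # waveforms after segmentation
```

Three segments of 6144 points use 18,432 of the roughly 19,800 recorded points, and each segment lasts 0.48 sec, which matches the "approximately 1/2 sec" quoted above.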

These waveforms were divided into two groups: Design Data and
Test Data. The Design Data consisted of 227 (approximately 50%) of the
453 waveforms and were used to design the classifier. The Test Data
consisted of the remaining 226 waveforms and were later used as a
test of the classifier. The breakdown is as follows:


TOTAL DATA

MODEM                 WAVEFORMS
Codex LSI-9600           156
Hughes HC-276             90
Paradyne LSI-96          105
Lenkurt 26-C             102
TOTAL                    453


DESIGN DATA

MODEM                 WAVEFORMS
Codex LSI-9600            78
Hughes HC-276             45
Paradyne LSI-96           53
Lenkurt 26-C              51
TOTAL                    227


TEST DATA

MODEM                 WAVEFORMS
Codex LSI-9600            78
Hughes HC-276             45
Paradyne LSI-96           52
Lenkurt 26-C              51
TOTAL                    226


It is important to keep in mind that all of the data collected was
used for the modem identification portion of the effort. All but one of
the waveforms for each modem were impaired with varying degrees of
different types of distortion (phase jitter, Gaussian noise, harmonic
distortion, and combinations of these). As our results will indicate,
the features utilized were powerful enough to "see" through these
distortions and properly classify each waveform's origin (modem).


IV.  DATA ANALYSIS AND FEATURE EXTRACTION


A.  General. 

One of the most important steps in the waveform classification
problem is the analysis of graphic representations of digitized
waveforms and their transformations for the purpose of hypothesizing
measurements or features which may aid in the discrimination of classes.
This step is important because the quality of the selected features
greatly influences the classifier's performance.

After  viewing  the  waveforms  via  WPS,  it  became  obvious  that  suitable 
features  for  classification  purposes  would  be  difficult  to  come  by. 

Sections  of  the  time  domain  waveforms,  with  no  impairments  added, 
for  each  modem,  are  shown  in  Figures  2,  3,  4 and  5. 

Several  waveform  transformations  and  other  techniques  were  suggested 
as  possible  feature  generation  methods.  One  of  these  proved  to  be 
totally  effective  in  the  problem  solution. 


The heart of this feature generation method is a nonexhaustive,
iterative procedure by which new, effective features can be generated
from existing, less effective attributes. With this method, each
pattern is first reduced to a binary sequence by a suitable coding
method. Each sequence is then described in a statistical sense by the
observed frequency of occurrence of certain selected binary words. The
categorization of the patterns is performed on the basis of such
observed values. Any pattern feature used with this method is the
observed frequency of occurrence of some specified binary word. The
method is, therefore, called the Frequency of Occurrence of Binary
Word Method, or FOBW Method.

The FOBW Method was implemented as follows:

The original waveform, W1, was transformed into a second
waveform, W2, such that W2(i) = T[W1(i)], where T is the transformation:

$$
T[W_1(i)] =
\begin{cases}
1 & \text{if } W_1(i) > 0 \\
0 & \text{if } W_1(i) \le 0
\end{cases}
$$

This simply says that, with W1 as the original waveform, if the
value of the i(th) point of W1 is greater than zero, then a one is
assigned to the i(th) point of W2; if the value of the i(th) point
of W1 is less than or equal to zero, then a zero is assigned to the
i(th) point of W2. I refer to this transformation as "binary
sequencing" a waveform. As an example, W1 would be binary sequenced as
shown in Figure 6.
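
A minimal sketch of binary sequencing is given below. The function name `binary_sequence` and the sample values are made up for illustration; only the thresholding rule comes from the text.

```python
def binary_sequence(waveform):
    """Map each sample to 1 if it is greater than zero, otherwise to 0."""
    return [1 if sample > 0 else 0 for sample in waveform]


# Made-up samples standing in for a short stretch of W1.
w1 = [-0.3, 0.8, 1.2, 0.0, -1.1, 0.4]
w2 = binary_sequence(w1)
```

Note that a sample exactly equal to zero maps to 0, in line with the "less than or equal to zero" branch of the transformation.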

The next step is to search the binary sequence for occurrences of
specific binary words (chosen more or less randomly at first). For
example, the word "11" occurs 7 times in the example waveform W2
(found by counting pairs of adjacent characters which are both ones).

Finally, the concept of "delays" is introduced. If we were to
search for the occurrence of the word "11" with delay 1, we would look
at the first character, skip one, and look at the third; then we would
go to the second, skip one, and look at the fourth; and so on. The
occurrence of "11" with delay 2 would be calculated by taking the first
character, skipping two, and taking the fourth; then the second
character with the fifth; and so on. As in our previous example, if no
delay is present, adjacent characters are compared. The notation would
be as follows:

for W2 = 011110000001111100001

$$
\begin{aligned}
\mathrm{FOBW}(11)_0 &= 7 \\
\mathrm{FOBW}(11)_1 &= 5 \\
\mathrm{FOBW}(11)_2 &= 3 \\
\mathrm{FOBW}(11)_6 &= 2
\end{aligned}
$$


for  delays  0,  1,  2,  and  6 respectively. 
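
The counting rule can be checked against the example sequence with a short sketch. The function name `fobw` and its signature are assumptions made for illustration; the program actually used in the study is the one listed in Appendix A.

```python
def fobw(bits, word, delay=0):
    """Count occurrences of a two-character binary word with a given delay.

    The first character of the word is compared with bits[i] and the second
    with bits[i + delay + 1], for every i where both positions exist.
    """
    if len(word) != 2 or any(ch not in "01" for ch in word):
        raise ValueError("word must be two characters drawn from '0' and '1'")
    first, second = int(word[0]), int(word[1])
    gap = delay + 1
    return sum(1 for i in range(len(bits) - gap)
               if bits[i] == first and bits[i + gap] == second)


# Example sequence W2 from the text.
w2 = [int(ch) for ch in "011110000001111100001"]
counts = {delay: fobw(w2, "11", delay) for delay in (0, 1, 2, 6)}
# counts == {0: 7, 1: 5, 2: 3, 6: 2}
```

With delay 0 the comparison is between adjacent characters, so the sketch reproduces the four counts quoted for the example sequence.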


In our case, the waveforms were 6144 points long (approximately 1/2
sec at a 12.8 kHz sampling rate). They were binary sequenced and were
originally searched for the occurrence of the binary words "11", "10",
"01", and "00". The delays used were 0, 10, 20, 30, 40, 50, 60, ...,
170, 180, 190, for a total of 20 different delays. These were chosen at
random. The result for each waveform was an 80-dimensional (4 words x 20
delays) vector, each feature representing the number of occurrences of
the given word with a specific delay.

As experience with the FOBW Method was acquired, it was noted that
there was some redundancy between the features obtained from the word
"11" and those from the word "00", and likewise between the words "10"
and "01" (in general, this redundancy is not to be expected). Also,
potentially better delays were arrived at. As a result of these
findings, the program was modified to search for the words "11" and
"10" with delays 0, 1, 2, 3, ..., 23, 24. The result for each waveform
was now a 50-dimensional (2 words x 25 delays) vector. A listing of the
feature extraction programs and all other programs used is provided in
Appendix A.
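
As a rough sketch of how a 50-dimensional vector of this kind could be assembled, the code below counts the words "11" and "10" at delays 0 through 24 on one binary sequenced waveform. The function names, the ordering of features within the vector (all "11" counts first, then all "10" counts), and the synthetic test tone are assumptions; the report's own programs are those in Appendix A.

```python
import math


def binary_sequence(waveform):
    """Binary sequencing: 1 where the sample is greater than zero, else 0."""
    return [1 if sample > 0 else 0 for sample in waveform]


def fobw(bits, word, delay):
    """Count a two-character word whose characters are delay + 1 positions apart."""
    first, second = int(word[0]), int(word[1])
    gap = delay + 1
    return sum(1 for i in range(len(bits) - gap)
               if bits[i] == first and bits[i + gap] == second)


def fobw_features(bits, words=("11", "10"), delays=range(25)):
    """Feature vector ordered word by word, then by increasing delay (assumed order)."""
    return [fobw(bits, word, delay) for word in words for delay in delays]


# Synthetic stand-in for one 6144-point segment: an 1800 Hz tone at 12.8 kHz.
waveform = [math.sin(2 * math.pi * 1800 * n / 12_800 + 0.1) for n in range(6144)]
bits = binary_sequence(waveform)
features = fobw_features(bits)   # 2 words x 25 delays = 50 counts
```

A useful sanity check on any such vector is that, for a given delay d, the "11" and "10" counts together equal the number of ones among the first 6144 - d - 1 characters, since every one in that range is followed (d + 1 positions later) by either a one or a zero.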

B.  Specific. 

The Design Data waveforms were binary sequenced and the feature
extraction algorithm was executed on each of the binary sequenced
waveforms. This algorithm searches for the words "11" and "10" with
delays 0, 1, 2, 3, ..., 23, 24. For each waveform, a 50-dimensional (2
words x 25 delays) vector is computed.

At this stage, we have a tree that has four nodes (classes) with a
total of 227 50-dimensional vectors. This set of vectors, which
contains the extracted features, is then used for the classifier design.
Once the classifier has been designed, the Test Data is binary
sequenced, the features are extracted, and these are then used to test
the classifier's efficiency in discriminating the four classes. The
design data tree structure is as follows:

V.  MEASUREMENT EVALUATION


We are now concerned with the discriminatory qualities of our fifty
measurements. In general, we would like to use the minimum number of
measurements that achieves a satisfactory solution. OLPARS provides
two suboptimal methods for ranking the discriminatory power of the
extracted features. Each of these methods provides for three types of
rankings. The first type uses a significance measure of a particular
feature, $x_p$, for discriminating class i from class j and is designated
$M_{ij}(x_p)$. The second type of ranking uses a significance measure of
$x_p$ for discriminating class i from all other classes and is designated
$M_i(x_p)$. The last type uses a measure of the overall significance of
$x_p$ for discriminating all classes and is designated $M(x_p)$.


The first method on OLPARS for ranking features is the
discriminant measure, which is useful when the class conditional
probability distributions are unimodal. These discriminant measures,
using feature $x_p$, are defined as follows. The expression for
$M_{ij}(x_p)$ is only partly legible in the source; the surviving term is

$$
\left[ (N_i - 1)\,(\sigma_p(i))^2 + (N_j - 1)\,(\sigma_p(j))^2 \right]
$$

where $\bar{x}_p(j)$ = the estimated mean of class j along $x_p$, and
$\sigma_p(j)$ = the estimated standard deviation of class j along $x_p$.

The discriminant measure for differentiating class i from all other
classes using measurement $x_p$ is defined as:

$$
M_i(x_p) = \sum_{\substack{j=1 \\ j \ne i}}^{K} M_{ij}(x_p)
$$

Finally, the discriminant measure for distinguishing all classes
using measurement $x_p$ is defined as:

$$
M(x_p) = \sum_{i=1}^{K} M_i(x_p) = \sum_{i=1}^{K} \sum_{\substack{j=1 \\ j \ne i}}^{K} M_{ij}(x_p)
$$

where K = the number of classes.

The other OLPARS feature evaluation method is the probability of
confusion measure. It is valid for any probability distribution since
it essentially measures the overlap of the class conditional
probabilities.

Since the functional forms of the class conditional probabilities
are not known, OLPARS estimates the marginal class distributions using
the sample data. The range for measurement $x_p$ is divided into cells of
width $\Delta$. The probability that a sample from class j will occupy the
r(th) cell along the range of measurement $x_p$ is given by:

$$
P_{rp}(j) = \int_{r\text{th cell}} P(x_p \mid C_j)\, dx_p
$$

The probability of confusion measures are defined as follows:

The pairwise measure for differentiating class i from class j
can be computed by:

$$
M_{ij}(x_p) = \sum_{r=1}^{N_p} \min\left( P_{rp}(i),\, P_{rp}(j) \right)
$$

The measure for differentiating class i from all other classes
using measurement $x_p$ is defined by:

$$
M_i(x_p) = \sum_{\substack{j=1 \\ j \ne i}}^{K} M_{ij}(x_p)
$$

Finally, the overall measure of significance of measurement $x_p$
for differentiating all classes is computed as follows:

$$
M(x_p) = \sum_{i=1}^{K} M_i(x_p) = \sum_{i=1}^{K} \sum_{\substack{j=1 \\ j \ne i}}^{K} M_{ij}(x_p)
$$

The ranking of the extracted features based on these evaluation
techniques provides the information required to rationally choose
initial subsets of the fifty features for logic design.
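
The probability of confusion measure lends itself to a short sketch. The code below estimates the cell probabilities by simple histogram counts over equal-width cells and then forms the pairwise, per-class, and overall measures as sums of minima. The function names, the choice of ten cells, and the sample values are assumptions for illustration; how OLPARS actually chooses the cell width is not stated in the text.

```python
def cell_probabilities(values, lo, width, n_cells):
    """Fraction of a class's samples that fall in each cell of the feature range."""
    counts = [0] * n_cells
    for v in values:
        r = min(int((v - lo) / width), n_cells - 1)  # top edge goes in the last cell
        counts[r] += 1
    return [c / len(values) for c in counts]


def confusion_measures(samples_by_class, n_cells=10):
    """Return (pairwise, per_class, overall) overlap measures for one feature."""
    all_values = [v for vals in samples_by_class.values() for v in vals]
    lo, hi = min(all_values), max(all_values)
    width = (hi - lo) / n_cells or 1.0   # guard against a zero-width range
    probs = {c: cell_probabilities(vals, lo, width, n_cells)
             for c, vals in samples_by_class.items()}
    classes = list(samples_by_class)
    pairwise = {(i, j): sum(min(p, q) for p, q in zip(probs[i], probs[j]))
                for i in classes for j in classes if i != j}
    per_class = {i: sum(pairwise[i, j] for j in classes if j != i)
                 for i in classes}
    overall = sum(per_class.values())
    return pairwise, per_class, overall


# Made-up values of one feature for three hypothetical classes.
samples = {"A": [1, 2, 3, 4], "B": [11, 12, 13, 14], "C": [1, 2, 3, 4]}
pairwise, per_class, overall = confusion_measures(samples)
```

Because the measure sums the overlap of the estimated distributions, a pairwise value of 0 means the two classes never share a cell along that feature, and a value of 1 means their estimated distributions coincide. Features with small values are therefore the better discriminators under this measure.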


VI.  LOGIC  DESIGN 


Logic design is an iterative process in which many designs, based on
modified versions of the initial feature subsets, are generated and
tested. Features which appear to discriminate between the more
troublesome classes are added, while superfluous features which rank
high for the same easily discriminated classes are eliminated.

The logic for the classifiers for this pattern recognition problem
is based on two approaches: the Pairwise Fisher Linear Discriminant
Technique and User-Defined Logic based on coordinate vector projections.
In the Pairwise Fisher Linear Discriminant Technique, for each pair of
classes i and j a unit vector $d_{ij}$ is computed such that projections
of the data onto $d_{ij}$ maximize the ratio of the between-class scatter
to the within-class scatter. The direction $d_{ij}$ which maximizes this
ratio is given by:

$$
d_{ij} = a\, W_{ij}^{-1} \left( \mu_i - \mu_j \right)
$$

where $W_{ij} = (N_i - 1)\,C_i + (N_j - 1)\,C_j$,
$C_i$ = estimated covariance matrix for class i,
$\mu_i$ = estimated mean vector of class i,
$N_i$ = number of vectors in class i,
and a = normalizing constant so that $|d_{ij}| = 1$.

OLPARS computes $d_{ij}$ and an initial threshold, $\theta_{ij}$, to
distinguish between all pairs of classes. These thresholds may be
adjusted, if necessary, to obtain optimal discrimination along each
$d_{ij}$.
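
The Fisher direction for one pair of classes can be sketched in a few lines. This is an illustration under stated assumptions, not the OLPARS implementation: the covariance estimate is taken to be the unbiased sample covariance (so that (N - 1) times it equals the sum of outer products of the centered vectors), the linear system is solved by plain Gaussian elimination, and the function names and two-dimensional sample data are made up.

```python
import math


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter_matrix(rows):
    """(N - 1) * C: the sum of outer products of the centered vectors."""
    mu = mean_vector(rows)
    dim = len(mu)
    s = [[0.0] * dim for _ in range(dim)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mu)]
        for a in range(dim):
            for b in range(dim):
                s[a][b] += diff[a] * diff[b]
    return s


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[r]) + [rhs[r]] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("within-class scatter matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fisher_direction(class_i, class_j):
    """Unit vector proportional to W_ij^{-1} (mu_i - mu_j)."""
    s_i, s_j = scatter_matrix(class_i), scatter_matrix(class_j)
    w = [[a + b for a, b in zip(ri, rj)] for ri, rj in zip(s_i, s_j)]
    delta = [a - b for a, b in zip(mean_vector(class_i), mean_vector(class_j))]
    d = solve(w, delta)
    norm = math.sqrt(sum(v * v for v in d))
    return [v / norm for v in d]


# Made-up two-dimensional design vectors for two hypothetical classes.
class_i = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (4.0, 5.0)]
class_j = [(5.0, 1.0), (6.0, 2.0), (7.0, 2.0), (8.0, 4.0)]
d_ij = fisher_direction(class_i, class_j)
```

With the sign convention used here, class i projects higher than class j along the computed direction, so a threshold placed between the two projected means separates them.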

For example, the inner product of an unknown input feature vector,
x, is taken with the discriminant $d_{CH}$ for the pair consisting of
CODEX LSI-9600 and HUGHES HC-276, and compared with the threshold
$\theta_{CH}$ for that pair. The inner product is

$$
\langle d_{CH}, x \rangle = \sum_{i=1}^{K} x_i\, d_{CH,i}
$$

where the sum here runs over the K components of the feature vector.

If $\langle d_{CH}, x \rangle > \theta_{CH}$, increment the counter for the
CODEX LSI-9600 class.

If $\langle d_{CH}, x \rangle < \theta_{CH}$, increment the counter for the
HUGHES HC-276 class.

If $\langle d_{CH}, x \rangle = \theta_{CH}$, increment the counter for the
class with the larger number of samples in the design set.

After all pairwise decisions are made, a binary vote is cast by each
comparator and the final decision is determined by the class counter
that received the most votes. In case of ties, the decision is given to
the class involved in the tie which has the highest a priori
probability. The resultant classification scheme is diagrammed in
Figure 7.
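
The voting scheme can be sketched as follows. The class names, discriminant directions, thresholds, design counts, and priors below are all made up for illustration; the function name `classify_by_pairwise_vote` is likewise an assumption and does not correspond to any OLPARS routine.

```python
def classify_by_pairwise_vote(x, discriminants, thresholds, design_counts, priors):
    """Cast one vote per class pair and return the class with the most votes.

    discriminants[(a, b)] is the direction for the pair (a, b); a projection
    above thresholds[(a, b)] votes for a, and one below it votes for b.
    """
    votes = {c: 0 for c in priors}
    for (a, b), d in discriminants.items():
        projection = sum(xi * di for xi, di in zip(x, d))
        theta = thresholds[(a, b)]
        if projection > theta:
            votes[a] += 1
        elif projection < theta:
            votes[b] += 1
        else:  # exactly on the threshold: favor the larger design class
            votes[a if design_counts[a] >= design_counts[b] else b] += 1
    top = max(votes.values())
    tied = [c for c, v in votes.items() if v == top]
    return max(tied, key=lambda c: priors[c])  # ties go to the highest prior


# Hypothetical three-class problem in a two-dimensional feature space.
discriminants = {("A", "B"): (1.0, 0.0), ("A", "C"): (0.0, 1.0), ("B", "C"): (1.0, 1.0)}
thresholds = {("A", "B"): 1.0, ("A", "C"): 0.0, ("B", "C"): 0.5}
design_counts = {"A": 20, "B": 50, "C": 30}
priors = {"A": 0.2, "B": 0.5, "C": 0.3}

clear_winner = classify_by_pairwise_vote((2.0, 0.5), discriminants, thresholds,
                                         design_counts, priors)
three_way_tie = classify_by_pairwise_vote((2.0, -1.0), discriminants, thresholds,
                                          design_counts, priors)
```

In the first call class A wins two of its comparisons outright. In the second call each class collects one vote, so the decision falls to the class with the highest a priori probability.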

FIGURE 7.  PAIRWISE FISHER LOGIC - CLASSIFICATION SCHEME

In user-defined logic, the analyst participates in the logic design
process. The vectors from the classes are projected on a one- or
two-space. If there is (in the analyst's judgment) sufficient
separation between classes, or between groups of classes, boundaries may
be drawn so that the feature space is partitioned into two or three
regions. These regions are then labeled as to the class or classes
present in them. Figure 36 in Experiment 8 illustrates partitioning
into three regions.

For the one-space implementation of these logics, the mathematics is
extremely simple. The unlabeled vector to be classified is projected
(dot product) onto the projection direction (discriminant); the value of
this scalar is then compared to the value of the boundary (threshold
drawn by the user). All user-defined logics in this report are one-space
logics; however, a two-space scatter plot of the logic designed
in one-space using features 4 and 27 is given in Figure 37.
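
A one-space decision at a single tree node can be sketched as below. The boundary values, region labels, and feature value are hypothetical, as are the function names; the only elements taken from the text are the projection onto a coordinate vector and the comparison with user-drawn boundaries.

```python
def project_on_coordinate(x, feature_number):
    """Dot product of x with the coordinate vector for a 1-based feature number."""
    unit = [1.0 if k == feature_number - 1 else 0.0 for k in range(len(x))]
    return sum(xi * ui for xi, ui in zip(x, unit))


def region_index(value, boundaries):
    """Index of the region containing value; boundaries must be sorted ascending."""
    return sum(1 for b in boundaries if value > b)


# Hypothetical node: feature 4 with two user-drawn boundaries (three regions).
boundaries = [900.0, 1100.0]
region_labels = [{"class A"}, {"class B", "class C"}, {"class D"}]

x = [0.0] * 50
x[3] = 1000.0                       # made-up value of feature 4
value = project_on_coordinate(x, 4)
labels = region_labels[region_index(value, boundaries)]
```

Projecting onto a coordinate vector simply selects one feature, so each node of such a tree reduces to comparing a single count with one or two thresholds. A region labeled with more than one class would be passed to a further node for another comparison.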


VII.  EXPERIMENTAL  RESULTS 


Nine classifiers were designed. The first four classifiers are
based on the Pairwise Fisher Linear Discriminant Technique. The
remaining five classifiers are decision trees which use one-dimensional
coordinate vector logic at each node. Figure 8 lists the features used
for each design (experiment).

All classifiers were designed using the Design Data set. These
classifiers were evaluated with the Design Data set and an independent
Test Data set (these two data sets are described in Section III).

The Design and Test confusion matrices, with their statistics, from
the resulting evaluation of each classifier are given for each
experiment. The histograms of the data projected along the features in
the decision trees using coordinate vectors in one-space are also given
for Experiments 5 through 9. The logic tree structures for these five
experiments are also shown. In the case of Experiment 8, the two-space
scatter plot with reference to the two features used in that experiment
is also given.

VIII.  DISCUSSION AND RECOMMENDATIONS


The success the classifiers had in the problem of modem
identification is proportional to the quality of the features provided
by the FOBW method. The question remains as to what aspect of the
modem's functioning these features were reflecting which provided such
excellent discriminatory quality. One way to approach this question is
by obtaining the auto-correlation of a number of sample waveforms from
each modem. This approach is suggested by the fact that the FOBW method
operates in a manner similar to that of obtaining an auto-correlation.
Cross-correlation between the different modem waveforms might also
provide information.
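
The similarity to an auto-correlation can be made concrete for the word "11". Writing the binary sequenced waveform as $b_1, \ldots, b_n$ with each $b_i$ equal to 0 or 1 (notation introduced here for illustration, not taken from the report), a pair of characters contributes to the count only when both are one, so the count is a sum of lagged products:

$$
\mathrm{FOBW}(11)_d = \sum_{i=1}^{n-d-1} b_i\, b_{i+d+1}
$$

This is the unnormalized auto-correlation of the binary sequence at lag $d + 1$. It is computed on the binary sequenced waveform rather than on the original samples, so it would be expected to resemble, but not equal, the auto-correlation of the waveform itself.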

Another important issue is that the data was, after all, collected
in a laboratory, not in the real world. To fully test the performance
of the classifiers designed using the data provided, it would certainly
be necessary to obtain data from a more realistic environment. Should
the classifiers' performance drop after being exposed in the field,
information provided by the auto-correlations and cross-correlations
obtained previously might suggest other, better features by way of the

APPENDIX B

SAMPLE 50 DIMENSIONAL VECTOR
FROM EACH MODEM (CODEX, HUGHES,
PARADYNE, LENKURT, RESPECTIVELY)

(? marks a character that is illegible in the source.)

FEATURE   CODEX           HUGHES          PARADYNE        LENKURT
   1      .22290000E+04   .21650000E+04   .23040000E+04   .20690000E+04
   2      .14380000E+04   .14240000E+04   .16220000E+04   .12790000E+04
   3      .76700000E+03   .18130000E+04   .11830000E+04   .79800000E+03
   4      .6?100000E+03   .11100000E+04   .11220000E+04   .85600000E+03
   5      .11820000E+04   .13900000E+04   .13100000E+04   .11270000E+04
   6      .17180000E+04   .14520000E+04   .14140000E+04   .12920000E+04
   7      .19680000E+04   .13530000E+04   .13930000E+04   .12270000E+04
   8      .18930000E+04   .13160000E+04   .13800000E+04   .11480000E+04
   9      .16820000E+04   .14210000E+04   .14490000E+04   .15770000E+04
  10      .14600000E+04   .15130000E+04   .15180000E+04   .19300000E+04
  11      .14940000E+04   .14890000E+04   .15270000E+04   .19450000E+04
  12      .14770000E+04   .14140000E+04   .15350000E+04   .16020000E+04
  13      .1?340000E+04   .14350000E+04   .15640000E+04   .12070000E+04
  14      .15030000E+04   .1?140000E+04   .16660000E+04   .10910000E+04
  15      .15290000E+04   .15590000E+04   .16880000E+04   .11720000E+04
  16      .15550000E+04   .1?360000E+04   .16080000E+04   .12620000E+04
  17      .15620000E+04   .15080000E+04   .15220000E+04   .11310000E+04
  18      .15610000E+04   .1?080000E+04   .15090000E+04   .10720000E+04
  19      .15330000E+04   .1?300000E+04   .15390000E+04   .12800000E+04
  20      .1?430000E+04   .1?190000E+04   .15620000E+04   .16630000E+04
  21      .15260000E+04   .14760000E+04   .15710000E+04   .19550000E+04
  22      .1?540000E+04   .14530000E+04   .15750000E+04   .19090000E+04
  23      .1?690000E+04   .14430000E+04   .15730000E+04   .15380000E+04
  24      .15600000E+04   .14330000E+04   .15560000E+04   .11980000E+04
  25      .15500000E+04   .14580000E+04   .15440000E+04   .11290000E+04
  26      .85400000E+03   .85000000E+03   .77900000E+03   .86000000E+03
  27      .16450000E+04   .1?900000E+04   .14600000E+04   .16500000E+04
  28      .23160000E+04   .20000000E+04   .18980000E+04   .21310000E+04
  29      .2?120000E+04   .19020000E+04   .19580000E+04   .20720000E+04
  30      .19010000E+04   .16210000E+04   .17610000E+04   .18000000E+04
  31      .13640000E+04   .15590000E+04   .16650000E+04   .16340000E+04
  32      .10930000E+04   .16580000E+04   .16860000E+04   .16990000E+04
  33      .11870000E+04   .16940000E+04   .16990000E+04   .17780000E+04
  34      .13980000E+04   .15880000E+04   .16300000E+04   .13490000E+04
  35      .16000000E+04   .14950000E+04   .15610000E+04   .99500000E+03
  36      .15860000E+04   .15190000E+04   .15520000E+04   .98000000E+03
  37      .16020000E+04   .15940000E+04   .15440000E+04   .13230000E+04
  38      .15440000E+04   .15730000E+04   .14940000E+04   .17180000E+04
  39      .15740000E+04   .14940000E+04   .14110000E+04   .18340000E+04
  40      .15470000E+04   .14490000E+04   .13880000E+04   .17530000E+04
  41      .15210000E+04   .14720000E+04   .14670000E+04   .16620000E+04
  42      .1?140000E+04   .14990000E+04   .15520000E+04   .17920000E+04
  43      .1?150000E+04   .14900000E+04   .15640000E+04   .18500000E+04
  44      .1?430000E+04   .14750000E+04   .15330000E+04   .16410000E+04
  45      .1?330000E+04   .14860000E+04   .15090000E+04   .12580000E+04
  46      .1?480000E+04   .15290000E+04   .15000000E+04   .96600000E+03
  47      .1?220000E+04   .15520000E+04   .14960000E+04   .10120000E+04
  48      .1?070000E+04   .1?620000E+04   .14980000E+04   .13830000E+04
  49      .15150000E+04   .15710000E+04   .15150000E+04   .17230000E+04
  50      .15240000E+04   .15450000E+04   .15260000E+04   .17920000E+04